AITopics | normalization condition

Collaborating Authors

normalization condition

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Extension of Symmetrized Neural Network Operators with Fractional and Mixed Activation Functions

Santos, Rômulo Damasclin Chaves dos, Sales, Jorge Henrique de Oliveira

arXiv.org Machine LearningJan-17-2025

We propose a novel extension to symmetrized neural network operators by incorporating fractional and mixed activation functions. This study addresses the limitations of existing models in approximating higher-order smooth functions, particularly in complex and high-dimensional spaces. Our framework introduces a fractional exponent in the activation functions, allowing adaptive non-linear approximations with improved accuracy. We define new density functions based on $q$-deformed and $\theta$-parametrized logistic models and derive advanced Jackson-type inequalities that establish uniform convergence rates. Additionally, we provide a rigorous mathematical foundation for the proposed operators, supported by numerical validations demonstrating their efficiency in handling oscillatory and fractional components. The results extend the applicability of neural network approximation theory to broader functional spaces, paving the way for applications in solving partial differential equations and modeling complex systems.

artificial intelligence, machine learning, operator, (10 more...)

arXiv.org Machine Learning

2501.10496

Genre: Research Report (0.64)

Industry:

Telecommunications > Networks (0.64)
Information Technology > Networks (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Universal scaling laws in quantum-probabilistic machine learning by tensor network towards interpreting representation and generalization powers

Bai, Sheng-Chen, Ran, Shi-Ju

arXiv.org Artificial IntelligenceOct-12-2024

Interpreting the representation and generalization powers has been a long-standing issue in the field of machine learning (ML) and artificial intelligence. This work contributes to uncovering the emergence of universal scaling laws in quantum-probabilistic ML. We take the generative tensor network (GTN) in the form of a matrix product state as an example and show that with an untrained GTN (such as a random TN state), the negative logarithmic likelihood (NLL) $L$ generally increases linearly with the number of features $M$, i.e., $L \simeq k M + const$. This is a consequence of the so-called ``catastrophe of orthogonality,'' which states that quantum many-body states tend to become exponentially orthogonal to each other as $M$ increases. We reveal that while gaining information through training, the linear scaling law is suppressed by a negative quadratic correction, leading to $L \simeq \beta M - \alpha M^2 + const$. The scaling coefficients exhibit logarithmic relationships with the number of training samples and the number of quantum channels $\chi$. The emergence of the quadratic correction term in NLL for the testing (training) set can be regarded as evidence of the generalization (representation) power of GTN. Over-parameterization can be identified by the deviation in the values of $\alpha$ between training and testing sets while increasing $\chi$. We further investigate how orthogonality in the quantum feature map relates to the satisfaction of quantum probabilistic interpretation, as well as to the representation and generalization powers of GTN. The unveiling of universal scaling laws in quantum-probabilistic ML would be a valuable step toward establishing a white-box ML scheme interpreted within the quantum probabilistic framework.

artificial intelligence, machine learning, orthogonality, (14 more...)

arXiv.org Artificial Intelligence

2410.09703

Country: Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

2 Matching Law

Neural Information Processing SystemsMar-15-2024, 16:02:45 GMT

This outcome corresponds to the undermatching phenomenon, which has been observed in behavioral experiments. Our results suggest that when we discuss the learning processes in a decision making network, it may be insufficient to only consider a steady state for individual weight updates, and we should therefore consider the dynamics of the weight distribution and the network architecture. This proceeding is a short version of our original paper [12], with the model modified and new results included. First, let us formulate the matching law. We will consider a case with two alternatives (each denoted as A and B), which has generally been studied in animal experiments.

choice probability, normalization, rm-hebb rule, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Learning principle and mathematical realization of the learning mechanism in the brain

Katayose, Taisuke

arXiv.org Machine LearningNov-22-2023

While deep learning has achieved remarkable success, there is no clear explanation about why it works so well. In order to discuss this question quantitatively, we need a mathematical framework that explains what learning is in the first place. After several considerations, we succeeded in constructing a mathematical framework that can provide a unified understanding of all types of learning, including deep learning and learning in the brain. We call it learning principle, and it follows that all learning is equivalent to estimating the probability of input data. We not only derived this principle, but also mentioned its application to actual machine learning models. For example, we found that conventional supervised learning is equivalent to estimating conditional probabilities, and succeeded in making supervised learning more effective and generalized. We also proposed a new method of defining the values of estimated probability using differentiation, and showed that unsupervised learning can be performed on arbitrary dataset without any prior knowledge. Namely, this method is a general-purpose machine learning in the true sense. Moreover, we succeeded in describing the learning mechanism in the brain by considering the time evolution of a fully or partially connected model and applying this new method. The learning principle provides solutions to many unsolved problems in deep learning and cognitive neuroscience.

loss function, normalization condition, probability, (14 more...)

arXiv.org Machine Learning

2311.13341

Genre: Research Report (0.40)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.54)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.75)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.49)

Add feedback

Making Linear MDPs Practical via Contrastive Representation Learning

Zhang, Tianjun, Ren, Tongzheng, Yang, Mengjiao, Gonzalez, Joseph E., Schuurmans, Dale, Dai, Bo

arXiv.org Artificial IntelligenceDec-7-2022

It is common to address the curse of dimensionality in Markov decision processes (MDPs) by exploiting low-rank representations. This motivates much of the recent theoretical study on linear MDPs. However, most approaches require a given representation under unrealistic assumptions about the normalization of the decomposition or introduce unresolved computational challenges in practice. Instead, we consider an alternative definition of linear MDPs that automatically ensures normalization while allowing efficient representation learning via contrastive estimation. The framework also admits confidence-adjusted index algorithms, enabling an efficient and principled approach to incorporating optimism or pessimism in the face of uncertainty. To the best of our knowledge, this provides the first practical representation learning method for linear MDPs that achieves both strong theoretical guarantees and empirical performance. Theoretically, we prove that the proposed algorithm is sample efficient in both the online and offline settings. Empirically, we demonstrate superior performance over existing state-of-the-art model-based and model-free algorithms on several benchmarks.

artificial intelligence, machine learning, reinforcement learning, (12 more...)

arXiv.org Artificial Intelligence

2207.0715

Country:

North America > Canada > Alberta (0.14)
Asia > Middle East > Jordan (0.04)
North America > United States > Maryland > Baltimore (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (1.00)

Industry: Education (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Add feedback

From influence diagrams to multi-operator cluster DAGs

Pralet, Cedric, Schiex, Thomas, Verfaillie, Gerard

arXiv.org Artificial IntelligenceJun-27-2012

There exist several architectures to solve influence diagrams using local computations, such as the Shenoy-Shafer, the HUGIN, or the Lazy Propagation architectures. They all extend usual variable elimination algorithms thanks to the use of so-called 'potentials'. In this paper, we introduce a new architecture, called the Multi-operator Cluster DAG architecture, which can produce decompositions with an improved constrained induced-width, and therefore induce potentially exponential gains. Its principle is to benefit from the composite nature of influence diagrams, instead of using uniform potentials, in order to better analyze the problem structure.

architecture, computation node, influence diagram, (12 more...)

arXiv.org Artificial Intelligence

1206.6844

Country: Europe > France > Occitanie > Haute-Garonne > Toulouse (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Effects of Synaptic Weight Diffusion on Learning in Decision Making Networks

Katahira, Kentaro, Okanoya, Kazuo, Okada, Masato

Neural Information Processing SystemsDec-31-2010

When animals repeatedly choose actions from multiple alternatives, they can allocate their choices stochastically depending on past actions and outcomes. It is commonly assumed that this ability is achieved by modifications in synaptic weights related to decision making. Choice behavior has been empirically found to follow Herrnstein’s matching law. Loewenstein & Seung (2006) demonstrated that matching behavior is a steady state of learning in neural networks if the synaptic weights change proportionally to the covariance between reward and neural activities. However, their proof did not take into account the change in entire synaptic distributions. In this study, we show that matching behavior is not necessarily a steady state of the covariance-based learning rule when the synaptic strength is sufficiently strong so that the fluctuations in input from individual sensory neurons influence the net input to output neurons. This is caused by the increasing variance in the input potential due to the diffusion of synaptic weights. This effect causes an undermatching phenomenon, which has been observed in many behavioral experiments. We suggest that the synaptic diffusion effects provide a robust neural mechanism for stochastic choice behavior.

artificial intelligence, choice probability, machine learning, (16 more...)

Neural Information Processing Systems

Country: Asia > Japan > Honshū > Kantō (0.14)

Genre: Research Report > New Finding (0.89)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.55)

Add feedback